A Taxonomy, Dataset, and Classifier for Automatic Noun Compound Interpretation
Abstract
The automatic interpretation of noun-noun compounds is an important subproblem within many natural language processing applications and is an area of increasing interest. The problem is difficult, with disagreement regarding the number and nature of the relations, low inter-annotator agreement, and limited annotated data. In this paper, we present a novel taxonomy of relations that integrates previous relations, the largest publicly-available annotated dataset, and a supervised classification method for automatic noun compound interpretation.
Similar Resources
Automatic Noun Compound Interpretation using Deep Neural Networks and Word Embeddings
The present paper reports on the results of automatic noun compound interpretation for English using a deep neural network classifier and a selection of publicly available word embeddings to represent the individual compound constituents. The task at hand consists of identifying the semantic relation that holds between the constituents of a compound (e.g. WHOLE+PART_OR_MEMBER_OF in the case of ...
A Dataset for Joint Noun-Noun Compound Bracketing and Interpretation
We present a new, sizeable dataset of noun–noun compounds with their syntactic analysis (bracketing) and semantic relations. Derived from several established linguistic resources, such as the Penn Treebank, our dataset enables experimenting with new approaches towards a holistic analysis of noun–noun compounds, such as joint learning of noun–noun compounds bracketing and interpretation, as well...
Noun Compound Interpretation Using Paraphrasing Verbs: Feasibility Study
The paper addresses an important challenge for the automatic processing of English written text: understanding noun compounds’ semantics. Following Downing (1977) [1], we define noun compounds as sequences of nouns acting as a single noun, e.g., bee honey, apple cake, stem cell, etc. In our view, they are best characterised by the set of all possible paraphrasing verbs that can connect the targ...
On Why Coarse Class Classification is a Bottleneck for Noun Compound Interpretation
Sequences of long nouns, i.e., noun compounds, occur frequently and are productive. Their interpretation is important for a variety of tasks located at various layers of NLP. Major reasons behind the poor performance of automatic noun compound interpretation are: (a) lack of a well defined inventory of semantic relations and (b) non-availability of sufficient, annotated, high-quality dataset. T...
Paraphrasing Verbs for Noun Compound Interpretation
An important challenge for the automatic analysis of English written text is the abundance of noun compounds: sequences of nouns acting as a single noun. In our view, their semantics is best characterized by the set of all possible paraphrasing verbs, with associated weights, e.g., malaria mosquito is carry (23), spread (16), cause (12), transmit (9), etc. Using Amazon’s Mechanical Turk, we col...